Guidelines for Data-Parallel Cycle-Stealing in Networks of Workstations
Abstract
We derive guidelines for nearly optimally scheduling data-parallel computations within a draconian mode of cycle-stealing in networks of workstations (NOWs). In this computing regimen, workstation A takes control of workstation B’s processor whenever B is idle, with the promise of relinquishing control immediately upon demand, thereby losing work in progress. The typically high communication overhead for supplying workstation B with work and receiving its results militates in favor of supplying B with large amounts of work at a time; the risk of losing work in progress when B is reclaimed militates in favor of supplying B with a succession of small bundles of work. The challenge is to balance these two pressures in a way that maximizes (some measure of) the amount of work accomplished. Our guidelines attempt to maximize the expected work accomplished by workstation B in an episode of cycle-stealing, assuming knowledge of the instantaneous probability of workstation B’s being reclaimed. Our study is a step toward rendering prescriptive the descriptive study of cycle-stealing in [3].

This research was supported in part by NSF Grant CCR-97-10367.

1. The Cycle-Stealing Problem

We derive guidelines for (almost) optimally scheduling data-parallel computations on “borrowed” workstations, within the model developed in [3]. The phenomenological study in that paper builds on the following rather draconian version of cycle-stealing (the use by one workstation of idle computing cycles of another). The owner of workstation A contracts to take control of workstation B whenever its owner is absent. When the owner of B reclaims that workstation, workstation A immediately relinquishes control of B, killing any active job(s) and thereby destroying all work since the last checkpoint. Such draconian “contracts” are inevitable, for instance, when a returning owner unplugs a laptop from a network; one encounters such contracts also at several institutions where cycle-stealing is supported.
Such a “contract” creates a tension between the following inherently conflicting aspects of cycle-stealing. On the one hand, since any work in progress on workstation B when it is reclaimed is lost, a cycle-stealer wants to break a cycle-stealing episode into many short periods, supplying small amounts of work to the borrowed workstation each time. On the other hand, since each of the inter-workstation communications that bracket every period in a cycle-stealing episode (to supply work to workstation B and to reclaim the results of that work) involves an expensive setup protocol, the cycle-stealer wants to break each cycle-stealing episode into a few long periods, supplying large amounts of work to workstation B each time. Clearly, the challenge in scheduling episodes of cycle-stealing is to balance these conflicting factors in a way that maximizes the productive output of the episode.

The research we report on here resolves the preceding conflict by deriving scheduling guidelines that (approximately) maximize the expected work¹ accomplished within an episode of cycle-stealing, within the following setting. We focus on computations that are data-parallel, in that they consist of a massive number of independent repetitive tasks of known durations. Many scientific computations have this form. We develop schedules assuming that we know the instantaneous probability of workstation B’s being reclaimed and that the function yielding this information is “smooth.” Although our results are stated as though we had exact knowledge of these probabilities, they extend easily to situations wherein this knowledge is approximate,

¹ In a forthcoming sequel, we focus on (nearly) optimizing other measures of a cycle-stealing episode’s work output.
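The tension between many short periods and a few long ones can be made concrete with a small numerical sketch. The model below is an illustrative assumption, not necessarily the exact formulation of [3]: each period of length t spends a fixed communication overhead c and contributes max(0, t - c) units of work, and that work counts only if workstation B has not been reclaimed by the end of the period. The function names (`expected_work`, `uniform_risk`) and the linear survival function are hypothetical choices made for this example.

```python
from typing import Callable, Sequence


def expected_work(periods: Sequence[float], overhead: float,
                  survival: Callable[[float], float]) -> float:
    """Expected work of a schedule under the assumed model.

    periods  -- lengths of the successive periods of the episode
    overhead -- communication cost paid in every period (assumption: fixed)
    survival -- survival(t) = probability B has NOT been reclaimed by time t
    """
    total = 0.0
    elapsed = 0.0
    for t in periods:
        elapsed += t
        # Work done in this period once the overhead is paid.
        useful = max(0.0, t - overhead)
        # The work is banked only if B survives until the period ends.
        total += useful * survival(elapsed)
    return total


def uniform_risk(lifespan: float) -> Callable[[float], float]:
    """Hypothetical survival function: reclaim time uniform on [0, lifespan]."""
    return lambda t: max(0.0, 1.0 - t / lifespan)


if __name__ == "__main__":
    lifespan, overhead = 100.0, 1.0  # illustrative values only
    p = uniform_risk(lifespan)
    for m in (1, 2, 5, 10, 25, 50):
        schedule = [lifespan / m] * m  # m equal-length periods
        print(m, round(expected_work(schedule, overhead, p), 3))
```

With these hypothetical numbers, ten equal periods yield more expected work than either two or fifty, and a single period spanning the whole lifespan yields none, since B is certain to have been reclaimed by its end. Equal-length periods are used only for simplicity; nothing here suggests they are optimal.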
Similar resources
Guidelines for Data-Parallel Cycle-Stealing in Networks of Workstations, II: On Maximizing Guaranteed Output
We derive efficient guidelines for scheduling data-parallel computations within a draconian mode of cycle-stealing in networks of workstations wherein an interruption by the owner of the “borrowed” workstation kills all jobs currently in progress. We derive both adaptive and non-adaptive scheduling guidelines that maximize, up to low-order additive terms, the amount of work that one is guaranteed...
A Comparison of Two Java Runtime Systems for Parallel Execution of Multithreaded Java Applications on Networks of Workstations
This paper assesses the performance of two Java frameworks for high performance computing (HPC) on networks of workstations (NOWs). The lottery-based work stealing algorithm is intrinsically distributed, and consequently scalable to an extremely large number of participant workstations. Although proved to be near optimal for the distribution of well-structured multithreaded computations across l...
On Optimal Strategies for Stealing Cycles
The growing importance of networked workstations as a computing milieu has created a new modality of parallel computing, namely, the possibility of having one workstation “steal cycles” from another. In a typical episode of cycle-stealing, the owner of workstation B allows the owner of workstation A to take control of B's processor whenever it is idle, with the promise of relinquishing control ...
Limitations of Cycle Stealing for Parallel Processing on a Network of Homogeneous Workstations
The low cost and availability of clusters of workstations have led researchers to re-explore distributed computing using independent workstations. This approach may provide better cost/performance than tightly coupled multiprocessors. In practice, this approach often utilizes wasted cycles to run parallel jobs. In this paper we address the feasibility and limitation of such a nondedicated par...
Title of dissertation: EXPLOITING IDLE CYCLES IN NETWORKS OF WORKSTATIONS
Title of dissertation: EXPLOITING IDLE CYCLES IN NETWORKS OF WORKSTATIONS Kyung Dong Ryu, Doctor of Philosophy, 2001 Dissertation directed by: Associate Professor Jeffrey K. Hollingsworth Department of Computer Science Studies have shown that workstations are idle a significant fraction of the time. Traditional idle resource harvesting systems define a social contract that permits guest jobs to ...
Publication date: 1998